Overview

Dataset statistics

Number of variables: 26
Number of observations: 416
Missing cells: 908
Missing cells (%): 8.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 84.6 KiB
Average record size in memory: 208.3 B
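
The derived figures in this table follow from the raw counts. The sketch below redoes that arithmetic using only numbers copied from the table; the variable names are illustrative and are not taken from the profiling tool. Because the total memory size is rounded to one decimal place in the report, the per-record size is reproduced only approximately.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 26
n_observations = 416
missing_cells = 908
total_size_kib = 84.6

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes. The total above is already rounded,
# so this only approximates the reported 208.3 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```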

Variable types

CAT: 16
NUM: 8
UNSUPPORTED: 2

Warnings

STATE has constant value "CA" (Constant)
COUNTRY has constant value "USA" (Constant)
PRODUCTCODE has a high cardinality: 109 distinct values (High cardinality)
MONTH_ID is highly correlated with QTR_ID (High correlation)
QTR_ID is highly correlated with MONTH_ID (High correlation)
YEAR_ID is highly correlated with ORDERNUMBER (High correlation)
ORDERNUMBER is highly correlated with YEAR_ID (High correlation)
STATUS is highly correlated with ORDERDATE (High correlation)
ORDERDATE is highly correlated with STATUS and 9 other fields (High correlation)
QTR_ID is highly correlated with ORDERDATE (High correlation)
YEAR_ID is highly correlated with ORDERDATE (High correlation)
CUSTOMERNAME is highly correlated with ORDERDATE and 6 other fields (High correlation)
PHONE is highly correlated with ORDERDATE and 6 other fields (High correlation)
ADDRESSLINE1 is highly correlated with ORDERDATE and 6 other fields (High correlation)
CITY is highly correlated with ORDERDATE and 6 other fields (High correlation)
POSTALCODE is highly correlated with ORDERDATE and 4 other fields (High correlation)
CONTACTLASTNAME is highly correlated with ORDERDATE and 4 other fields (High correlation)
CONTACTFIRSTNAME is highly correlated with ORDERDATE and 4 other fields (High correlation)
ADDRESSLINE2 has 416 (100.0%) missing values (Missing)
POSTALCODE has 76 (18.3%) missing values (Missing)
TERRITORY has 416 (100.0%) missing values (Missing)
PRODUCTCODE is uniformly distributed (Uniform)
df_index has unique values (Unique)
ADDRESSLINE2 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
TERRITORY is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
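
As a rough illustration of how the Constant and Missing warnings could be derived, the sketch below scans a small table held as a dict of column lists. The function name `column_warnings`, the use of `None` for a missing cell, and the toy rows are all assumptions made for this example; this is not the profiling tool's own implementation.

```python
def column_warnings(columns):
    """Return {column name: [warning labels]} for a dict of column -> values.

    None stands in for a missing cell in this sketch.
    """
    found = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        labels = []
        # One distinct non-missing value means the column carries no information.
        if present and len(set(present)) == 1:
            labels.append("Constant")
        # Any gap between total and non-missing cells is reported as missing.
        if len(present) < len(values):
            labels.append("Missing")
        if labels:
            found[name] = labels
    return found


# Toy rows shaped like the dataset above (values are illustrative).
sample = {
    "STATE": ["CA", "CA", "CA", "CA"],
    "POSTALCODE": ["97562", None, "94217", "90003"],
    "TERRITORY": [None, None, None, None],
    "CITY": ["San Rafael", "San Jose", "Pasadena", "Burbank"],
}
result = column_warnings(sample)
print(result)
```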

Reproduction

Analysis started: 2020-12-12 09:57:51.156974
Analysis finished: 2020-12-12 09:58:46.118063
Duration: 54.96 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 416
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1341.96875
Minimum: 3
Maximum: 2807
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 3
5-th percentile: 125.75
Q1: 676.75
median: 1348.5
Q3: 2027.25
95-th percentile: 2619
Maximum: 2807
Range: 2804
Interquartile range (IQR): 1350.5
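
Range and IQR are simple differences of the other quantile statistics. The sketch below checks that against the df_index figures and also shows one common way to compute a quantile, linear interpolation between neighbouring order statistics. That interpolation rule is an assumption here, since the report does not state which method it uses, and the helper name `quantile` is illustrative.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Figures copied from the df_index quantile statistics.
minimum, q1, q3, maximum = 3, 676.75, 2027.25, 2807

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range

print(value_range, iqr)
print(quantile([1, 2, 3, 4], 0.25))
```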

Descriptive statistics

Standard deviation: 792.1587047
Coefficient of variation (CV): 0.59029594
Kurtosis: -1.164334936
Mean: 1341.96875
Median Absolute Deviation (MAD): 675.5
Skewness: 0.053092298
Sum: 558259
Variance: 627515.4135
Monotonicity: Strictly increasing
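
Several of the descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks these identities against the df_index figures; the variable names are illustrative.

```python
import math

# Figures copied from the df_index statistics and the dataset overview.
n_observations = 416
mean = 1341.96875
std = 792.1587047
reported_cv = 0.59029594
reported_variance = 627515.4135
reported_sum = 558259

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance is the squared standard deviation
total = mean * n_observations   # sum recovered from the mean

print(f"CV: {cv:.8f}")
print(f"variance: {variance:.4f}")
print(f"sum: {total:.0f}")
print(math.isclose(variance, reported_variance, rel_tol=1e-6))
```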
Histogram with fixed size bins (bins=50)
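
A histogram with fixed-size bins splits the interval from the minimum to the maximum into equally wide bins and counts the values falling into each. The sketch below is a minimal version of that idea; the function name and the choice to place the maximum in the last bin are assumptions for illustration, and the input is just the five smallest and the largest df_index values listed in this section.

```python
def fixed_width_histogram(values, bins):
    """Return (edges, counts) for `bins` equally wide bins spanning min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so one bin holds them all.
        return [lo, hi], [len(values)]
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the last edge; clamp it into the final bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


edges, counts = fixed_width_histogram([3, 4, 5, 8, 29, 2807], bins=50)
print(f"bin width: {edges[1] - edges[0]:.2f}")
print(counts[0], counts[-1])
```

With 50 bins over the df_index range of 2804, each bin is about 56 units wide, so the five smallest values all share the first bin.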
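Common values
Value | Count | Frequency (%)
1022 | 1 | 0.2%
2042 | 1 | 0.2%
667 | 1 | 0.2%
319 | 1 | 0.2%
1345 | 1 | 0.2%
2392 | 1 | 0.2%
1352 | 1 | 0.2%
1353 | 1 | 0.2%
330 | 1 | 0.2%
289 | 1 | 0.2%
Other values (406) | 406 | 97.6%

Minimum 5 values
Value | Count | Frequency (%)
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
8 | 1 | 0.2%
29 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
2807 | 1 | 0.2%
2800 | 1 | 0.2%
2795 | 1 | 0.2%
2780 | 1 | 0.2%
2779 | 1 | 0.2%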

ORDERNUMBER
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 45
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10253.22356
Minimum: 10111
Maximum: 10421
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 10111
5-th percentile: 10135
Q1: 10160
median: 10226
Q3: 10367
95-th percentile: 10400
Maximum: 10421
Range: 310
Interquartile range (IQR): 207

Descriptive statistics

Standard deviation: 96.86876241
Coefficient of variation (CV): 0.009447639746
Kurtosis: -1.452307836
Mean: 10253.22356
Median Absolute Deviation (MAD): 81
Skewness: 0.2722781276
Sum: 4265341
Variance: 9383.55713
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=45)

Common values
Value | Count | Frequency (%)
10168 | 18 | 4.3%
10222 | 18 | 4.3%
10159 | 18 | 4.3%
10312 | 17 | 4.1%
10135 | 17 | 4.1%
10182 | 17 | 4.1%
10390 | 16 | 3.8%
10142 | 16 | 3.8%
10145 | 16 | 3.8%
10229 | 14 | 3.4%
Other values (35) | 249 | 59.9%

Minimum 5 values
Value | Count | Frequency (%)
10111 | 6 | 1.4%
10113 | 4 | 1.0%
10135 | 17 | 4.1%
10140 | 11 | 2.6%
10142 | 16 | 3.8%

Maximum 5 values
Value | Count | Frequency (%)
10421 | 2 | 0.5%
10407 | 12 | 2.9%
10400 | 9 | 2.2%
10396 | 8 | 1.9%
10390 | 16 | 3.8%

QUANTITYORDERED
Real number (ℝ≥0)

Distinct: 38
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.01201923
Minimum: 6
Maximum: 76
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 6
5-th percentile: 21
Q1: 28
median: 36
Q3: 44
95-th percentile: 50
Maximum: 76
Range: 70
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 9.916147219
Coefficient of variation (CV): 0.2753566012
Kurtosis: 0.4791128201
Mean: 36.01201923
Median Absolute Deviation (MAD): 8
Skewness: 0.3493986235
Sum: 14981
Variance: 98.32997567
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=38)

Common values
Value | Count | Frequency (%)
31 | 21 | 5.0%
33 | 21 | 5.0%
49 | 21 | 5.0%
36 | 19 | 4.6%
48 | 19 | 4.6%
39 | 17 | 4.1%
43 | 16 | 3.8%
37 | 16 | 3.8%
38 | 16 | 3.8%
25 | 15 | 3.6%
Other values (28) | 235 | 56.5%

Minimum 5 values
Value | Count | Frequency (%)
6 | 1 | 0.2%
13 | 1 | 0.2%
20 | 15 | 3.6%
21 | 9 | 2.2%
22 | 9 | 2.2%

Maximum 5 values
Value | Count | Frequency (%)
76 | 2 | 0.5%
66 | 1 | 0.2%
64 | 2 | 0.5%
59 | 2 | 0.5%
58 | 1 | 0.2%

PRICEEACH
Real number (ℝ≥0)

Distinct: 218
Distinct (%): 52.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82.85576923
Minimum: 27.22
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 27.22
5-th percentile: 38.98
Q1: 66.9375
median: 94.705
Q3: 100
95-th percentile: 100
Maximum: 100
Range: 72.78
Interquartile range (IQR): 33.0625

Descriptive statistics

Standard deviation: 21.19091213
Coefficient of variation (CV): 0.2557566278
Kurtosis: -0.4574761942
Mean: 82.85576923
Median Absolute Deviation (MAD): 5.295
Skewness: -0.9476293672
Sum: 34468
Variance: 449.054757
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
100 | 182 | 43.8%
40.25 | 3 | 0.7%
90.57 | 3 | 0.7%
64.33 | 2 | 0.5%
61.15 | 2 | 0.5%
51.93 | 2 | 0.5%
61.99 | 2 | 0.5%
98.65 | 2 | 0.5%
43.27 | 2 | 0.5%
36.29 | 2 | 0.5%
Other values (208) | 214 | 51.4%

Minimum 5 values
Value | Count | Frequency (%)
27.22 | 1 | 0.2%
29.54 | 1 | 0.2%
30.59 | 1 | 0.2%
32.88 | 1 | 0.2%
33.19 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
100 | 182 | 43.8%
99.66 | 1 | 0.2%
99.58 | 1 | 0.2%
99.55 | 1 | 0.2%
99.21 | 1 | 0.2%

ORDERLINENUMBER
Real number (ℝ≥0)

Distinct: 18
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.663461538
Minimum: 1
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 6
Q3: 10
95-th percentile: 15
Maximum: 18
Range: 17
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.412164077
Coefficient of variation (CV): 0.6621429495
Kurtosis: -0.551924671
Mean: 6.663461538
Median Absolute Deviation (MAD): 3
Skewness: 0.6032707571
Sum: 2772
Variance: 19.46719184
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=18)

Common values
Value | Count | Frequency (%)
1 | 45 | 10.8%
2 | 42 | 10.1%
3 | 39 | 9.4%
4 | 37 | 8.9%
5 | 33 | 7.9%
6 | 31 | 7.5%
7 | 29 | 7.0%
8 | 27 | 6.5%
9 | 24 | 5.8%
10 | 22 | 5.3%
Other values (8) | 87 | 20.9%

Minimum 5 values
Value | Count | Frequency (%)
1 | 45 | 10.8%
2 | 42 | 10.1%
3 | 39 | 9.4%
4 | 37 | 8.9%
5 | 33 | 7.9%

Maximum 5 values
Value | Count | Frequency (%)
18 | 3 | 0.7%
17 | 6 | 1.4%
16 | 9 | 2.2%
15 | 9 | 2.2%
14 | 11 | 2.6%

SALES
Real number (ℝ≥0)

Distinct: 415
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3619.091899
Minimum: 541.14
Maximum: 14082.8
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 541.14
5-th percentile: 1217.6775
Q1: 2203.075
median: 3244.97
Q3: 4623.255
95-th percentile: 7351.45
Maximum: 14082.8
Range: 13541.66
Interquartile range (IQR): 2420.18

Descriptive statistics

Standard deviation: 1945.958755
Coefficient of variation (CV): 0.5376925508
Kurtosis: 2.862834459
Mean: 3619.091899
Median Absolute Deviation (MAD): 1148.97
Skewness: 1.316855096
Sum: 1505542.23
Variance: 3786755.475
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
4181.44 | 2 | 0.5%
2718.72 | 1 | 0.2%
1847 | 1 | 0.2%
2838.81 | 1 | 0.2%
6763.05 | 1 | 0.2%
3091.68 | 1 | 0.2%
3025.92 | 1 | 0.2%
3958.5 | 1 | 0.2%
3662.52 | 1 | 0.2%
2760.94 | 1 | 0.2%
Other values (405) | 405 | 97.4%

Minimum 5 values
Value | Count | Frequency (%)
541.14 | 1 | 0.2%
717.4 | 1 | 0.2%
834.67 | 1 | 0.2%
846.51 | 1 | 0.2%
856.52 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
14082.8 | 1 | 0.2%
11623.7 | 1 | 0.2%
11336.7 | 1 | 0.2%
9661.44 | 1 | 0.2%
9470.94 | 1 | 0.2%

ORDERDATE
Categorical

HIGH CORRELATION

Distinct: 43
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
2/17/2005 0:00 | 22 | 5.3%
10/10/2003 0:00 | 18 | 4.3%
2/19/2004 0:00 | 18 | 4.3%
10/28/2003 0:00 | 18 | 4.3%
11/12/2003 0:00 | 17 | 4.1%
7/2/2003 0:00 | 17 | 4.1%
10/21/2004 0:00 | 17 | 4.1%
3/4/2005 0:00 | 16 | 3.8%
8/8/2003 0:00 | 16 | 3.8%
8/25/2003 0:00 | 16 | 3.8%
Other values (33) | 241 | 57.9%

Frequencies of value counts

Unique

Unique: 3
Unique (%): 0.7%

Histogram of lengths of the category

Length

Max length: 15
Median length: 14
Mean length: 14.08894231
Min length: 13

STATUS
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
Shipped | 389 | 93.5%
Resolved | 13 | 3.1%
On Hold | 12 | 2.9%
In Process | 2 | 0.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 10
Median length: 7
Mean length: 7.045673077
Min length: 7
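
The Length statistics describe the character length of each category value, weighted by how often the value occurs. The sketch below reproduces the STATUS figures from the frequency table; the variable names are illustrative.

```python
from statistics import mean, median

# Counts copied from the STATUS frequency table.
status_counts = {"Shipped": 389, "Resolved": 13, "On Hold": 12, "In Process": 2}

# One length per row of the dataset, so frequent values weigh more.
lengths = [len(value) for value, count in status_counts.items() for _ in range(count)]

max_length = max(lengths)
median_length = median(lengths)
mean_length = mean(lengths)
min_length = min(lengths)

print(max_length, median_length, round(mean_length, 9), min_length)
```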

QTR_ID
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
1 | 158 | 38.0%
4 | 121 | 29.1%
3 | 95 | 22.8%
2 | 42 | 10.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

MONTH_ID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.052884615
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 7
Q3: 10
95-th percentile: 11
Maximum: 12
Range: 11
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 3.700484025
Coefficient of variation (CV): 0.6113587587
Kurtosis: -1.511574605
Mean: 6.052884615
Median Absolute Deviation (MAD): 3.5
Skewness: 0.03422829123
Sum: 2518
Variance: 13.69358202
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Common values
Value | Count | Frequency (%)
10 | 71 | 17.1%
2 | 58 | 13.9%
1 | 52 | 12.5%
3 | 48 | 11.5%
8 | 45 | 10.8%
7 | 39 | 9.4%
11 | 30 | 7.2%
4 | 21 | 5.0%
12 | 20 | 4.8%
5 | 16 | 3.8%
Other values (2) | 16 | 3.8%

Minimum 5 values
Value | Count | Frequency (%)
1 | 52 | 12.5%
2 | 58 | 13.9%
3 | 48 | 11.5%
4 | 21 | 5.0%
5 | 16 | 3.8%

Maximum 5 values
Value | Count | Frequency (%)
12 | 20 | 4.8%
11 | 30 | 7.2%
10 | 71 | 17.1%
9 | 11 | 2.6%
8 | 45 | 10.8%
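
The warning that MONTH_ID and QTR_ID are highly correlated is what one would expect if the quarter is derived from the month. The sketch below checks that reading: it maps each month to a calendar quarter and compares the aggregated counts with the QTR_ID frequencies. The count for month 6 is not listed in the tables and is derived here as the remainder of the 416 rows, and the mapping itself is an assumption about how the column was built.

```python
# Month counts copied from the MONTH_ID frequency tables.
month_counts = {1: 52, 2: 58, 3: 48, 4: 21, 5: 16, 7: 39,
                8: 45, 9: 11, 10: 71, 11: 30, 12: 20}

# Month 6 is not shown in the tables; derive it as the remainder of 416 rows.
month_counts[6] = 416 - sum(month_counts.values())

# Assumed rule: months 1-3 map to quarter 1, months 4-6 to quarter 2, and so on.
quarter_counts = {}
for month, count in month_counts.items():
    quarter = (month - 1) // 3 + 1
    quarter_counts[quarter] = quarter_counts.get(quarter, 0) + count

# Counts copied from the QTR_ID frequency table.
reported_quarters = {1: 158, 2: 42, 3: 95, 4: 121}
print(quarter_counts == reported_quarters)
```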
 

YEAR_ID
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
2003 | 163 | 39.2%
2004 | 143 | 34.4%
2005 | 110 | 26.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

PRODUCTLINE
Categorical

Distinct: 7
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
Vintage Cars | 125 | 30.0%
Classic Cars | 116 | 27.9%
Motorcycles | 54 | 13.0%
Trucks and Buses | 52 | 12.5%
Planes | 37 | 8.9%
Ships | 24 | 5.8%
Trains | 8 | 1.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 16
Median length: 12
Mean length: 11.31730769
Min length: 5

MSRP
Real number (ℝ≥0)

Distinct: 80
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.66105769
Minimum: 33
Maximum: 214
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.2 KiB

Quantile statistics

Minimum: 33
5-th percentile: 41
Q1: 66
median: 97
Q3: 122
95-th percentile: 170.75
Maximum: 214
Range: 181
Interquartile range (IQR): 56

Descriptive statistics

Standard deviation: 41.33318073
Coefficient of variation (CV): 0.4147375282
Kurtosis: -0.02108227963
Mean: 99.66105769
Median Absolute Deviation (MAD): 29
Skewness: 0.6364308194
Sum: 41459
Variance: 1708.431829
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
99 | 17 | 4.1%
60 | 16 | 3.8%
118 | 15 | 3.6%
136 | 13 | 3.1%
62 | 12 | 2.9%
127 | 11 | 2.6%
102 | 10 | 2.4%
101 | 10 | 2.4%
50 | 10 | 2.4%
80 | 9 | 2.2%
Other values (70) | 293 | 70.4%

Minimum 5 values
Value | Count | Frequency (%)
33 | 5 | 1.2%
35 | 5 | 1.2%
37 | 4 | 1.0%
40 | 4 | 1.0%
41 | 5 | 1.2%

Maximum 5 values
Value | Count | Frequency (%)
214 | 6 | 1.4%
207 | 4 | 1.0%
194 | 2 | 0.5%
193 | 5 | 1.2%
173 | 4 | 1.0%

PRODUCTCODE
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 109
Distinct (%): 26.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
S18_3320 | 9 | 2.2%
S18_1367 | 7 | 1.7%
S18_4668 | 7 | 1.7%
S24_4258 | 7 | 1.7%
S18_1097 | 7 | 1.7%
S18_2795 | 7 | 1.7%
S18_2248 | 6 | 1.4%
S18_3136 | 6 | 1.4%
S12_1666 | 6 | 1.4%
S10_1949 | 6 | 1.4%
Other values (99) | 348 | 83.7%

Frequencies of value counts

Unique

Unique: 1
Unique (%): 0.2%

Histogram of lengths of the category

Length

Max length: 9
Median length: 8
Mean length: 8.084134615
Min length: 8

CUSTOMERNAME
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
Mini Gifts Distributors Ltd. | 180 | 43.3%
Corporate Gift Ideas Co. | 41 | 9.9%
The Sharp Gifts Warehouse | 40 | 9.6%
Technics Stores Inc. | 34 | 8.2%
Toys4GrownUps.com | 30 | 7.2%
Collectable Mini Designs Co. | 25 | 6.0%
Mini Wheels Co. | 21 | 5.0%
Signal Collectibles Ltd. | 15 | 3.6%
Men 'R' US Retailers, Ltd. | 14 | 3.4%
West Coast Collectables Co. | 13 | 3.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 28
Median length: 27
Mean length: 24.89182692
Min length: 15

PHONE
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
4155551450 | 180 | 43.3%
6505551386 | 41 | 9.9%
4085553659 | 40 | 9.6%
6505556809 | 34 | 8.2%
6265557265 | 30 | 7.2%
7605558146 | 25 | 6.0%
6505555787 | 21 | 5.0%
4155554312 | 15 | 3.6%
2155554369 | 14 | 3.4%
3105553722 | 13 | 3.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

ADDRESSLINE1
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
5677 Strong St. | 180 | 43.3%
7734 Strong St. | 41 | 9.9%
3086 Ingle Ln. | 40 | 9.6%
9408 Furth Circle | 34 | 8.2%
78934 Hillside Dr. | 30 | 7.2%
361 Furth Circle | 25 | 6.0%
5557 North Pendale Street | 21 | 5.0%
2793 Furth Circle | 15 | 3.6%
6047 Douglas Av. | 14 | 3.4%
3675 Furth Circle | 13 | 3.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 25
Median length: 15
Mean length: 16.02403846
Min length: 14

ADDRESSLINE2
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 416
Missing (%): 100.0%
Memory size: 3.4 KiB

CITY
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
San Rafael | 180 | 43.3%
San Francisco | 62 | 14.9%
San Jose | 40 | 9.6%
Burlingame | 34 | 8.2%
Pasadena | 30 | 7.2%
San Diego | 25 | 6.0%
Brisbane | 15 | 3.6%
Los Angeles | 14 | 3.4%
Burbank | 13 | 3.1%
Glendale | 3 | 0.7%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 13
Median length: 10
Mean length: 9.903846154
Min length: 7

STATE
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
CA | 416 | 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

POSTALCODE
Categorical

HIGH CORRELATION
MISSING

Distinct: 6
Distinct (%): 1.8%
Missing: 76
Missing (%): 18.3%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
97562 | 180 | 43.3%
94217 | 89 | 21.4%
90003 | 30 | 7.2%
91217 | 25 | 6.0%
94019 | 13 | 3.1%
92561 | 3 | 0.7%
(Missing) | 76 | 18.3%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 5
Median length: 5
Mean length: 4.634615385
Min length: 3
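
Two POSTALCODE figures only add up once the 76 missing rows are taken into account. The sketch below shows one reading that reproduces them: Distinct (%) is taken over the non-missing rows, and the length statistics treat each missing value as a three-character string such as "nan". Both points are inferences from the numbers, not something the report states.

```python
# Counts copied from the POSTALCODE frequency table.
postal_counts = {"97562": 180, "94217": 89, "90003": 30,
                 "91217": 25, "94019": 13, "92561": 3}
n_missing = 76

n_present = sum(postal_counts.values())
n_rows = n_present + n_missing

# Assumption: distinct share is measured against non-missing rows only.
distinct_pct = 100 * len(postal_counts) / n_present

# Assumption: a missing cell is measured as the 3-character string "nan".
total_length = sum(len(code) * count for code, count in postal_counts.items())
total_length += len("nan") * n_missing
mean_length = total_length / n_rows

print(f"distinct: {distinct_pct:.1f}%")
print(f"mean length: {mean_length:.9f}")
```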

COUNTRY
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
USA | 416 | 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

TERRITORY
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 416
Missing (%): 100.0%
Memory size: 3.4 KiB

CONTACTLASTNAME
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
Nelson | 180 | 43.3%
Brown | 41 | 9.9%
Frick | 40 | 9.6%
Thompson | 38 | 9.1%
Hirano | 34 | 8.2%
Young | 33 | 7.9%
Murphy | 21 | 5.0%
Taylor | 15 | 3.6%
Chandler | 14 | 3.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 6
Mean length: 5.975961538
Min length: 5

CONTACTFIRSTNAME
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
Valarie | 205 | 49.3%
Julie | 92 | 22.1%
Sue | 55 | 13.2%
Juri | 34 | 8.2%
Michael | 14 | 3.4%
Steve | 13 | 3.1%
Leslie | 3 | 0.7%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 7
Median length: 7
Mean length: 5.713942308
Min length: 3

DEALSIZE
Categorical

Distinct: 3
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Value | Count | Frequency (%)
Medium | 208 | 50.0%
Small | 181 | 43.5%
Large | 27 | 6.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 6
Median length: 5.5
Mean length: 5.5
Min length: 5

Interactions

[Figures: pairwise interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
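
The calculation described above can be sketched in plain Python as follows. The function name `pearson_r` is illustrative. The normalising factors of the covariance and of the two standard deviations cancel, so the sketch works directly with sums of deviations from the mean.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) factors cancel between numerator and denominator.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / (spread_x * spread_y)


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(xs, ys)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)
print(round(r, 4), round(r_rescaled, 4))
```

The `ValueError` for a constant variable mirrors why constant columns such as STATE and COUNTRY cannot take part in a correlation matrix: their standard deviation is zero.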

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
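
A minimal sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. Giving tied values the average of their ranks is a common convention and is assumed here; the function names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson's r from sums of deviations (normalising factors cancel)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but clearly nonlinear
print(round(spearman_rho(xs, ys), 4), round(pearson_r(xs, ys), 4))
```

For the cubic relationship in the example, ρ is 1 while r stays below 1, which is the difference the paragraph above describes.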

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
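
The pair-counting definition above can be sketched directly. This is the simple variant often called tau-a, in which tied pairs count as neither concordant nor discordant. Statistical libraries frequently report a tie-corrected variant instead, so results on data with many ties may differ from the report's figures. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

In the example, two of the three pairs are concordant and one is discordant, giving τ = (2 - 1) / 3.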

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
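
A minimal sketch of a bias-corrected Cramér's V, following the correction as it is commonly stated: subtract (k-1)(r-1)/(n-1) from φ² = χ²/n, floor the result at zero, and shrink the row and column counts r and k by (r-1)²/(n-1) and (k-1)²/(n-1). The function name and the pure-Python χ² computation are choices made for this example and are not taken from the profiling tool's source.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    joint = Counter(zip(xs, ys))
    rows, cols = Counter(xs), Counter(ys)
    r, k = len(rows), len(cols)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_total in rows.items():
        for b, col_total in cols.items():
            expected = row_total * col_total / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(r_corr - 1, k_corr - 1)
    if denominator <= 0:
        return 0.0  # a constant variable carries no association
    return math.sqrt(phi2_corr / denominator)


perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(round(perfect, 6), round(independent, 6))
```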

Missing values

[Figures: missing-value plots]

Sample

First rows

Row | df_index | ORDERNUMBER | QUANTITYORDERED | PRICEEACH | ORDERLINENUMBER | SALES | ORDERDATE | STATUS | QTR_ID | MONTH_ID | YEAR_ID | PRODUCTLINE | MSRP | PRODUCTCODE | CUSTOMERNAME | PHONE | ADDRESSLINE1 | ADDRESSLINE2 | CITY | STATE | POSTALCODE | COUNTRY | TERRITORY | CONTACTLASTNAME | CONTACTFIRSTNAME | DEALSIZE
0 | 3 | 10145 | 45 | 83.26 | 6 | 3746.70 | 8/25/2003 0:00 | Shipped | 3 | 8 | 2003 | Motorcycles | 95 | S10_1678 | Toys4GrownUps.com | 6265557265 | 78934 Hillside Dr. | NaN | Pasadena | CA | 90003 | USA | NaN | Young | Julie | Medium
1 | 4 | 10159 | 49 | 100.00 | 14 | 5205.27 | 10/10/2003 0:00 | Shipped | 4 | 10 | 2003 | Motorcycles | 95 | S10_1678 | Corporate Gift Ideas Co. | 6505551386 | 7734 Strong St. | NaN | San Francisco | CA | NaN | USA | NaN | Brown | Julie | Medium
2 | 5 | 10168 | 36 | 96.66 | 1 | 3479.76 | 10/28/2003 0:00 | Shipped | 4 | 10 | 2003 | Motorcycles | 95 | S10_1678 | Technics Stores Inc. | 6505556809 | 9408 Furth Circle | NaN | Burlingame | CA | 94217 | USA | NaN | Hirano | Juri | Medium
3 | 8 | 10201 | 22 | 98.57 | 2 | 2168.54 | 12/1/2003 0:00 | Shipped | 4 | 12 | 2003 | Motorcycles | 95 | S10_1678 | Mini Wheels Co. | 6505555787 | 5557 North Pendale Street | NaN | San Francisco | CA | NaN | USA | NaN | Murphy | Julie | Small
4 | 29 | 10140 | 37 | 100.00 | 11 | 7374.10 | 7/24/2003 0:00 | Shipped | 3 | 7 | 2003 | Classic Cars | 214 | S10_1949 | Technics Stores Inc. | 6505556809 | 9408 Furth Circle | NaN | Burlingame | CA | 94217 | USA | NaN | Hirano | Juri | Large
5 | 36 | 10215 | 35 | 100.00 | 3 | 6075.30 | 1/29/2004 0:00 | Shipped | 1 | 1 | 2004 | Classic Cars | 214 | S10_1949 | West Coast Collectables Co. | 3105553722 | 3675 Furth Circle | NaN | Burbank | CA | 94019 | USA | NaN | Thompson | Steve | Medium
6 | 44 | 10312 | 48 | 100.00 | 3 | 11623.70 | 10/21/2004 0:00 | Shipped | 4 | 10 | 2004 | Classic Cars | 214 | S10_1949 | Mini Gifts Distributors Ltd. | 4155551450 | 5677 Strong St. | NaN | San Rafael | CA | 97562 | USA | NaN | Nelson | Valarie | Large
7 | 46 | 10333 | 26 | 100.00 | 3 | 3003.00 | 11/18/2004 0:00 | Shipped | 4 | 11 | 2004 | Classic Cars | 214 | S10_1949 | Mini Wheels Co. | 6505555787 | 5557 North Pendale Street | NaN | San Francisco | CA | NaN | USA | NaN | Murphy | Julie | Medium
8 | 48 | 10357 | 32 | 100.00 | 10 | 5691.84 | 12/10/2004 0:00 | Shipped | 4 | 12 | 2004 | Classic Cars | 214 | S10_1949 | Mini Gifts Distributors Ltd. | 4155551450 | 5677 Strong St. | NaN | San Rafael | CA | 97562 | USA | NaN | Nelson | Valarie | Medium
9 | 50 | 10381 | 36 | 100.00 | 3 | 8254.80 | 2/17/2005 0:00 | Shipped | 1 | 2 | 2005 | Classic Cars | 214 | S10_1949 | Corporate Gift Ideas Co. | 6505551386 | 7734 Strong St. | NaN | San Francisco | CA | NaN | USA | NaN | Brown | Julie | Large

Last rows

Row | df_index | ORDERNUMBER | QUANTITYORDERED | PRICEEACH | ORDERLINENUMBER | SALES | ORDERDATE | STATUS | QTR_ID | MONTH_ID | YEAR_ID | PRODUCTLINE | MSRP | PRODUCTCODE | CUSTOMERNAME | PHONE | ADDRESSLINE1 | ADDRESSLINE2 | CITY | STATE | POSTALCODE | COUNTRY | TERRITORY | CONTACTLASTNAME | CONTACTFIRSTNAME | DEALSIZE
406 | 2720 | 10142 | 38 | 85.41 | 4 | 3245.58 | 8/8/2003 0:00 | Shipped | 3 | 8 | 2003 | Ships | 99 | S700_3962 | Mini Gifts Distributors Ltd. | 4155551450 | 5677 Strong St. | NaN | San Rafael | CA | 97562 | USA | NaN | Nelson | Valarie | Medium
407 | 2727 | 10222 | 31 | 95.34 | 17 | 2955.54 | 2/19/2004 0:00 | Shipped | 1 | 2 | 2004 | Ships | 99 | S700_3962 | Collectable Mini Designs Co. | 7605558146 | 361 Furth Circle | NaN | San Diego | CA | 91217 | USA | NaN | Thompson | Valarie | Small
408 | 2748 | 10168 | 39 | 82.91 | 17 | 3233.49 | 10/28/2003 0:00 | Shipped | 4 | 10 | 2003 | Planes | 74 | S700_4002 | Technics Stores Inc. | 6505556809 | 9408 Furth Circle | NaN | Burlingame | CA | 94217 | USA | NaN | Hirano | Juri | Medium
409 | 2752 | 10222 | 43 | 74.03 | 2 | 3183.29 | 2/19/2004 0:00 | Shipped | 1 | 2 | 2004 | Planes | 74 | S700_4002 | Collectable Mini Designs Co. | 7605558146 | 361 Furth Circle | NaN | San Diego | CA | 91217 | USA | NaN | Thompson | Valarie | Medium
410 | 2754 | 10250 | 38 | 62.19 | 12 | 2363.22 | 5/11/2004 0:00 | Shipped | 2 | 5 | 2004 | Planes | 74 | S700_4002 | The Sharp Gifts Warehouse | 4085553659 | 3086 Ingle Ln. | NaN | San Jose | CA | 94217 | USA | NaN | Frick | Sue | Small
411 | 2779 | 10209 | 48 | 44.69 | 3 | 2145.12 | 1/9/2004 0:00 | Shipped | 1 | 1 | 2004 | Planes | 49 | S72_1253 | Men 'R' US Retailers, Ltd. | 2155554369 | 6047 Douglas Av. | NaN | Los Angeles | CA | NaN | USA | NaN | Chandler | Michael | Small
412 | 2780 | 10222 | 31 | 45.69 | 7 | 1416.39 | 2/19/2004 0:00 | Shipped | 1 | 2 | 2004 | Planes | 49 | S72_1253 | Collectable Mini Designs Co. | 7605558146 | 361 Furth Circle | NaN | San Diego | CA | 91217 | USA | NaN | Thompson | Valarie | Small
413 | 2795 | 10400 | 20 | 56.12 | 4 | 1122.40 | 4/1/2005 0:00 | Shipped | 2 | 4 | 2005 | Planes | 49 | S72_1253 | The Sharp Gifts Warehouse | 4085553659 | 3086 Ingle Ln. | NaN | San Jose | CA | 94217 | USA | NaN | Frick | Sue | Small
414 | 2800 | 10142 | 39 | 44.23 | 5 | 1724.97 | 8/8/2003 0:00 | Shipped | 3 | 8 | 2003 | Ships | 54 | S72_3212 | Mini Gifts Distributors Ltd. | 4155551450 | 5677 Strong St. | NaN | San Rafael | CA | 97562 | USA | NaN | Nelson | Valarie | Small
415 | 2807 | 10222 | 36 | 63.34 | 18 | 2280.24 | 2/19/2004 0:00 | Shipped | 1 | 2 | 2004 | Ships | 54 | S72_3212 | Collectable Mini Designs Co. | 7605558146 | 361 Furth Circle | NaN | San Diego | CA | 91217 | USA | NaN | Thompson | Valarie | Small